For the data analysis of large-scale experiments such as LHC Atlas and other Japanese high energy and nuclear
physics projects, we have constructed a Grid test bed at ICEPP and KEK. These institutes are connected to a
national scientific gigabit network backbone called SuperSINET. In our test bed, we have installed NorduGrid
middleware based on Globus, and connected a 120TB HPSS at KEK as a large-scale data store. Atlas simulation
data at ICEPP has been transferred and accessed using SuperSINET. We have tested various performance
characteristics of HPSS through this high-speed WAN. The measurements include a comparison of data access
performance between connections over a low-latency LAN and a long-distance WAN.



1. Introduction 

For the Atlas Japan collaboration, the International
Center for Elementary Particle Physics, University
of Tokyo (ICEPP) will build a "Tier-1" regional center
and the High Energy Accelerator Research Organization
(KEK) will build a "Tier-2" regional center for the
Atlas experiment of the Large Hadron Collider (LHC)
project at CERN. The two institutes are connected by
SuperSINET, an ultrahigh-speed network for Japanese
academic research. A Grid test bed was constructed on
this network to study the required functionality and
performance issues for the tiered regional centers.

High Performance Storage System (HPSS) with
high-density digital tape libraries could be a key
component for handling the petabytes of data produced
by the Atlas experiment and for sharing such data among
the regional collaborators. The HPSS parallel and
concurrent data transfer mechanisms, which support
disks, tapes and tape libraries, are effective and
scalable enough to support huge data storage. This paper
describes the integration of HPSS into a Grid
architecture and the measurement of HPSS performance
over a high-speed WAN.



2. Test bed system 

The computer resources for the test bed were
installed at the ICEPP and KEK sites. One Grid server at
each site and the HPSS servers at KEK were connected
by 1-Gbps Ethernet through SuperSINET. All resources,
including the network, were isolated from other
users and dedicated to the test. Figure 1 and Table I
show our hardware setup.



Three storage system components were employed.
A disk storage server shared its host with the Grid
server at each of KEK and ICEPP. The remaining
component was HPSS, whose software components ran in
the KEK Central Computer system. The HPSS data flow is
depicted in Fig. 2. The HPSS servers, including the core
servers, disk movers, and tape movers, are tightly
coupled by an IBM SP2 cluster network switch.

For the performance measurement of the original pftp
(parallel ftp with Kerberos authentication) server,
pftpd was run on the core HPSS server. For the
GSI-enabled HPSS server, which is described in
Section 4, pftpd was run on the same processors as the
disk mover. The disk movers were directly connected to
the test bed LAN through their network interface cards.
Two HPSS disk movers were dedicated to the test.

NorduGrid middleware ran on the Grid servers.
The other computing elements (CEs) acted as Portable
Batch System (PBS) [1] nodes, on which the NorduGrid
middleware did not need to be installed.

NorduGrid [5] is a pioneering Grid project in
Scandinavia that added upper-layer functionality
necessary for HEP computing on top of the Globus toolkit.
The middleware was simple to understand and offered
functionality sufficient for our test bed study.

Table II shows the versions of the middleware used in
the test bed.



3. HPSS performance over high-speed WAN

3.1. Basic network performance

Table I Test bed hardware

Site   Role                          Hardware
ICEPP  Grid and PBS server           1 x Athlon 1.7GHz 2CPU
ICEPP  Computing Element             4 x Pentium III 1.4GHz 2CPU
KEK    Grid and PBS server           1 x Pentium III 1GHz 2CPU
KEK    Computing Element             50 x Pentium III 1GHz 2CPU
KEK    HPSS disk mover               2 x Power3 375MHz
KEK    HPSS tape mover and library   19 x Power3 375MHz, IBM 3590

Figure 1: Layout of the test bed hardware (GRID test bed
environment with HPSS through GbE-WAN).

Before the end-to-end measurements, the basic Gigabit
Ethernet performance between the IBM HPSS servers and a
host at ICEPP through the WAN, as well as a host
on the KEK LAN, was measured using netperf [2]. The
result is shown in Figure 3 as a function of the TCP
buffer size of the client. The average round trip time
(RTT) was 3 to 4 ms. The network quality of service was
quite good and almost free from packet loss (< 0.1%).
In this measurement, the maximum TCP window size on
the HPSS servers was 256kB (a buffer size optimized for
the IBM SP2 switching network), while the clients
at both KEK and ICEPP had 64MB. Because of the rather
slow processors in the HPSS servers, the maximum raw
TCP transfer performance was limited to below 1Gbps.
As seen in the graph, the network access performance
through the LAN and the WAN became almost equivalent
and saturated beyond a client buffer size of 0.5MB.
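
The saturation near 0.5MB on the WAN is close to what the bandwidth-delay product of the path would suggest. The following is only a rough check under our own assumptions, not part of the measurement: it takes a nominal 1 Gbps link rate and the RTT range quoted above, and computes how much data must be in flight to keep the link busy. The function name and the nominal link rate are illustrative.

```python
def bandwidth_delay_product_bytes(link_bits_per_s: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep a link of the given rate busy."""
    return link_bits_per_s * rtt_s / 8.0


LINK_BPS = 1e9  # assumed nominal Gigabit Ethernet rate
for rtt_ms in (3.0, 4.0):  # RTT range reported for the WAN path
    bdp = bandwidth_delay_product_bytes(LINK_BPS, rtt_ms / 1000.0)
    print(f"RTT {rtt_ms} ms -> BDP {bdp / 1e6:.3f} MB")
```

For an RTT of 3 to 4 ms this gives roughly 0.375MB to 0.5MB, so a client buffer smaller than that would be expected to hold a single WAN stream below the link rate.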

Table II Test bed software

Software    Version
Globus      2.2.2
NorduGrid   0.3.12
PBS         2.3.16
HPSS        4.3

Figure 2: HPSS players. Clients: computing elements at
ICEPP/KEK acting as GridFTP clients (2-CPU Pentium III
1GHz, RedHat 7.2, Globus 2.2). HPSS servers: disk movers
with disk cache running the GSI pftp server (2-CPU
Power3 375MHz, AIX 4.3, HPSS 4.3, Globus 2.0) and tape
movers (2-CPU Power3 375MHz x 3, AIX 4.3, HPSS 4.3) with
3590 tape (14MB/s, 40GB). The figure marks part of the
HPSS system as shared by many users.

Figure 4 shows the network performance as a function of
the number of simultaneous transfer sessions through
the WAN. With a TCP buffer size of 100KB, up to 4
parallel stream sessions improved the network
throughput. With a buffer size larger than 1MB, multiple
stream sessions did not improve the aggregate network
transfer speed.
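
The effect of parallel sessions with a small buffer can be illustrated with a window-limited estimate, in which each stream carries at most one window of data per round trip. This is a sketch under stated assumptions: the RTT is taken as the midpoint of the reported range, the link capacity is a nominal 1 Gbps, 100KB and 1MB are read as decimal bytes, and the 256kB server-side window and the CPU limit of the HPSS servers are ignored. The function name is ours.

```python
def aggregate_throughput_mbps(window_bytes: float, rtt_s: float,
                              n_streams: int, link_mbps: float) -> float:
    """Window-limited estimate: each stream carries at most one window per RTT."""
    per_stream_mbps = window_bytes * 8.0 / rtt_s / 1e6
    return min(n_streams * per_stream_mbps, link_mbps)


RTT_S = 0.0035      # assumed midpoint of the 3-4 ms RTT
LINK_MBPS = 1000.0  # assumed nominal GbE capacity
for window in (100_000, 1_000_000):  # 100KB and 1MB, taken as decimal bytes
    for n in (1, 2, 4, 6):
        rate = aggregate_throughput_mbps(window, RTT_S, n, LINK_MBPS)
        print(f"window={window} B, streams={n}: {rate:.0f} Mbps")
```

In this simplified model a 100KB window needs several streams to approach the link capacity, whereas a single stream with a 1MB window is already limited by the link rather than by the window, which is qualitatively the behaviour described above.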

Figure 3: Basic GbE network transfer speed measured by
netperf, plotted against the TCP buffer size of the
client (MByte) for the KEK client and the ICEPP client;
HPSS mover buffer size = 256kB.



3.2. HPSS client API performance 

Figure 5 shows the data transfer speed obtained with the
HPSS client API, comparing access from the LAN with
access over the WAN. Data were transferred between the
disk of an HPSS disk mover and the memory of the client
host, in both directions. The transferred file size was
2GB in all cases. The disk access speed in the disk
mover was 80MB/s.
Figure 4: Network performance with no. of TCP stream
sessions measured by netperf, from the ICEPP client to
the KEK HPSS mover over the WAN (client buffer sizes of
1MB and 100KB; HPSS mover buffer size = 256kB).



Figure 5 shows that even with a larger buffer size in
the client API, the WAN access speed was about half
of the LAN access speed, both for reading from and for
writing to the HPSS server.

To increase HPSS WAN performance in future tests, 
the newer pdata protocol provided in HPSS 4.3 can 
be employed. This will improve pget performance. To 
get the same effect on pputs, the pdata-push protocol 
provided in HPSS 5.1 is required. 

The existing mover and pdata protocols are driven 
by the HPSS mover with the mover requesting each 
data packet by sending a pdata header to the client 
mover. The client mover then sends the data. This 
exchange creates latency on a WAN. The pdata-push 
protocol allows the client mover to determine the 
HPSS movers that will be the target of all data pack- 
ets when the data transfer is set up. This protocol 
eliminates the pdata header interchange and allows 
the client to just flush data buffers to the appropri- 
ate mover. The result is that the data is streamed to 
the HPSS mover by TCP at whatever rates it can be 
delivered by the client side mover and written to the 
HPSS mover devices. 
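
The latency cost of the header exchange can be illustrated with a deliberately simple model. It assumes that only one block request is outstanding at a time, which the description above suggests but does not state explicitly, and it uses hypothetical values for the block size, the usable link rate and the LAN round trip time; none of these numbers come from HPSS or from our measurements.

```python
def request_driven_rate(block_bytes: float, rtt_s: float,
                        link_bytes_per_s: float) -> float:
    """Each block waits one round trip for its header request before it is sent."""
    return block_bytes / (rtt_s + block_bytes / link_bytes_per_s)


def push_rate(file_bytes: float, rtt_s: float, link_bytes_per_s: float) -> float:
    """Targets are fixed at setup (one round trip); data is then streamed."""
    return file_bytes / (rtt_s + file_bytes / link_bytes_per_s)


BLOCK = 256 * 1024  # hypothetical block size, not taken from HPSS
LINK = 75e6         # hypothetical usable link rate in bytes/s
FILE = 2e9          # 2GB file, as in the measurements
for name, rtt in (("LAN", 0.0002), ("WAN", 0.0035)):  # LAN RTT is assumed
    print(name,
          f"request-driven {request_driven_rate(BLOCK, rtt, LINK) / 1e6:.1f} MB/s,",
          f"push {push_rate(FILE, rtt, LINK) / 1e6:.1f} MB/s")
```

In this model the request-driven rate falls as the round trip time grows, while the streamed rate stays close to the link rate because the round trip is paid only once at setup.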



3.3. pftp client - pftpd server transfer speed



Figure 6 shows the data transfer speed obtained with HPSS
pftp from the HPSS disk mover to the /dev/null dummy
device on the client. As with the HPSS client API
transfers above, even with a pftp buffer size of 64MB
the access speed over the WAN was about half of that
over the LAN. In addition, transferring a single file
over multiple TCP streams with the pftp 'pwidth' option
was not effective in our situation. In our server
layout, each of the two disk mover hosts had two RAID
disks. Therefore, up to 4 concurrent file transfers
could achieve higher network utilization and overall
throughput, as was seen in both the WAN and the LAN
case. The same figure (Fig. 6) also shows the data
transfer speed from the HPSS disk mover to the client
disks, which had a writing performance of 35-45MB/s.
Although the disks in both the server and the client
hosts had access speeds exceeding 30MB/s and the network
transfer speed exceeded 80MB/s, the overall transfer
speed dropped to 20MB/s. This is because these three
resources were accessed not in parallel but in series.
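
The drop can be reproduced with a simple serial model in which the time spent per byte in each stage adds up. The choice of inputs is ours: 80MB/s for the mover disk (the figure quoted in Section 3.2), 80MB/s for the network, and 40MB/s for the client disk (the midpoint of the 35-45MB/s range). The function names are illustrative.

```python
def serial_rate(*stage_rates_mb_s: float) -> float:
    """Stages used one after another: per-byte times add up."""
    return 1.0 / sum(1.0 / r for r in stage_rates_mb_s)


def pipelined_rate(*stage_rates_mb_s: float) -> float:
    """Stages fully overlapped: the slowest stage sets the pace."""
    return min(stage_rates_mb_s)


# Assumed inputs: 80 MB/s mover disk (Section 3.2), 80 MB/s network,
# 40 MB/s client disk (midpoint of the 35-45 MB/s range).
stages = (80.0, 80.0, 40.0)
print(f"serial: {serial_rate(*stages):.1f} MB/s")
print(f"pipelined: {pipelined_rate(*stages):.1f} MB/s")
```

With these inputs the serial estimate is 20MB/s, in line with the measured overall speed, whereas fully overlapped stages would be bounded only by the 40MB/s client disk.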

Figure 7 shows the elapsed time for accessing data in the
tape library. Thanks to HPSS functionality and an
adequate number of tape movers and tape drives, the
data I/O throughput scaled with the number of
concurrent file transfers. However, since the library
had only two accessors and could load at most two tape
cassettes into drives simultaneously, the throughput
goes down when data on more than three off-drive tapes
must be accessed.

A comparison of writing to the HPSS disk mover from a
client over the WAN and over the LAN is shown in Fig. 8.
In the figure, 'N files -> N files', for example, means
'reading' N files simultaneously at the client and
'writing' N files to the server. In our setup, the HPSS
server had 4 independent disks but the client had only
one. Reading multiple files in parallel from a single
disk at the client side degrades the aggregate
performance because of contention for the disk heads.
Figure 5: HPSS client API performance: read and write
transfer speed for the ICEPP client and the KEK client
as a function of buffer size (MB).



Figure 6: Performance of pftpd-pftp client read to client
/dev/null and disk (HPSS mover disk -> client) for the
KEK client (LAN) and the ICEPP client (WAN), as a
function of the number of file transfers in parallel.
Transfers to client disk used a TCP buffer of 64MB;
client disk speed 35-45MB/s (KEK 48MB/s, ICEPP 33MB/s).
Figure 7: pftpd-pftp read to client disk from tape archive
performance. Curves: data on HPSS mover disk; data on a
tape mounted in an HPSS mover; data on an unmounted tape.

Figure 8: pftpd-pftp write to server cache-disk
performance (pftp -> pftpd, client disk -> mover disk).
Curves: ICEPP client and KEK client, each reading from a
single file and from multiple files; ICEPP client with
pwidth.



4. GSI-enabled pftp 

GridFTP [3] is a standard protocol for building data
Grids and supports the features of the Grid Security
Infrastructure (GSI), multiple data channels for
parallel transfers, partial file transfers, third-party
transfers, and reusable and authenticated data channels.

The pftp and ftp provided with the HPSS software were
not required or designed to support a data Grid
infrastructure. For future releases, HPSS Collaboration
Members have introduced data Grid pftp requirements,
and the HPSS Technical Committee (TC) has convened a
Grid Working Group to propose a development plan. As an
interim and partial HPSS data Grid interface solution,
the HPSS Collaboration is distributing the GSI-enabled
pftp developed by Lawrence Berkeley National Laboratory
(LBL). The HPSS TC is also working with the GridFTP
development project underway at Argonne National
Laboratory.

To acquire the HPSS data Grid interface necessary
for our test bed, we requested and received a copy of
the latest version of the GSI-enabled pftp. The protocol
itself is pftp, but it supports the GSI-enabled AUTH and
ADAT FTP commands.

Table III shows the commands in each FTP protocol.
While GSI-enabled pftp and GridFTP have different
command sets for parallel transfer, buffer management
and Data Channel Authentication (DCAU), the base
command set is common. Fortunately, the functions unique
to each protocol are optional, and the two protocols are
able to communicate. By installing and testing the
GSI-enabled pftp, we confirmed that the GSI-enabled pftp
daemon from LBL could be accessed successfully from
gsincftp and from globus-url-copy with the nodcau option
(standard Globus client utilities). The server was
accessible from NorduGrid as well. Below is a sample
XRSL (Extended Resource Specification Language) job
description that uses the GSI-enabled pftp server as a
storage element (SE) of NorduGrid.



A sample XRSL

&(executable=gsiml)
(arguments="-d")
(inputfiles=
("Bdata.in"
"gsiftp://dt05s.cc:1\
(stdout=datafiles.out)
(join=true)
(maxcputime="36000")
(middleware="nordugrid")
(jobname="HPSS access test")
(stdlog="grid_debug")
(ftpThreads=1)

In the performance measurement with a 2GB file
accessed from a pftp client, the GSI-enabled pftp
server and the normal Kerberos pftp server had
equivalent elapsed data transfer times in every
situation. Accessed from a GridFTP client, the
GSI-enabled pftp server usually had a transfer time
equivalent to that of the normal pftp server as well.
However, when multiple disk movers were used and the
accessed data and the GSI-enabled pftpd server resided
on separate disk movers, the transfer speed was halved.
Figure 9 shows the aggregate transfer speed as a
function of the number of independent simultaneous file
transfers and illustrates this situation. After
investigating the detailed communication between client
and server, we found a difference in the behaviour of
the two servers. With the original pftp, where pftpd
runs on the HPSS core server, the data path is
established directly between the pftp client and the
disk mover. With the GSI-enabled pftp, on the other
hand, the data flows from the disk mover to the client
host via pftpd. When the disk mover and the pftpd server
do not reside on the same host, two successive network
transfers are incurred.
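
The three configurations can be summarized by counting the network hops on the data path. The sketch below is illustrative only: the host names are hypothetical, and the estimate that speed scales inversely with the number of hops is a crude assumption of ours (it presumes that the relaying host is the bottleneck for both hops), chosen because it matches the observed halving.

```python
from typing import List


def data_path(server: str, pftpd_host: str, mover_host: str) -> List[str]:
    """Network hops the data takes from the disk mover to the client."""
    if server == "pftp" or pftpd_host == mover_host:
        # Original pftp: the mover talks to the client directly.
        # GSI-enabled pftpd on the mover itself: no extra network hop.
        return [f"{mover_host} -> client"]
    # GSI-enabled pftpd on another host relays the data.
    return [f"{mover_host} -> {pftpd_host}", f"{pftpd_host} -> client"]


def relative_speed(hops: List[str]) -> float:
    """Crude estimate: speed scales inversely with the number of network hops."""
    return 1.0 / len(hops)


# Host names below are hypothetical placeholders.
for server, pftpd, mover in (("pftp", "core", "mover1"),
                             ("gsi-pftp", "mover1", "mover1"),
                             ("gsi-pftp", "mover2", "mover1")):
    hops = data_path(server, pftpd, mover)
    print(server, pftpd, mover, hops, relative_speed(hops))
```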
Table III Commands in FTP protocol

Protocol               Commands
GridFTP only           SPAS, SPOR, ERET, ESTO, SBUF, DCAU
GSI-enabled pftp only  PBSZ, PCLO, POPN, PPOR, PROT, PRTR, PSTO
Common                 AUTH, ADAT, RFC959 commands



Figure 9: Read performance from GSI-enabled pftpd
server to Gridftp client. Curves: disk mover (!=pftpd)
-> client with pftpd -> pftp; disk mover (=pftpd) ->
client with GSI-pftpd -> globus-url-copy; disk mover
(!=pftpd) -> client with GSI-pftpd -> globus-url-copy.



5. Summary

ICEPP and KEK configured a NorduGrid test bed
with an HPSS storage server over a high-speed GbE WAN.
Data access performance was measured with several
system configurations, comparing LAN and WAN access.
We found that network latency affected the data transfer
speed of HPSS pftp and the client API. The "GSI-enabled
pftpd" developed by LBL was successfully adopted as the
interface between the Grid infrastructure and HPSS.

This paper is a report on work in progress. Final
results require that the questions concerning raw TCP
performance, server/client protocol traffic, and the
pftp protocol be evaluated further; that any necessary
modifications or parameter changes be obtained from
our HPSS team members; and that the measurements be
taken again. Further understanding of the scalability
and limitations of multi-disk-mover configurations
would be gained by measuring HPSS network utilization
and performance with higher-performance network
interface adapters, system software and infrastructure,
and processor configurations.



References 



[1] http://www.openpbs.org

[2] http://www.netperf.org

[3] http://www.globus.org/datagrid/gridftp.html

[4] http://www.sdsc.edu/hpss/

[5] http://www.nordugrid.org. NorduGrid papers can also
be found in these proceedings.

[6] S. Yashiro et al., "Data transfer using buffered I/O
API with HPSS", CHEP'01, Beijing, Jul. 2001



THCT002 






